Search | BVS Regional Portal

1.

Machine learning based estimation of hoarseness severity using sustained vowels.

Schraut, Tobias; Schützenberger, Anne; Arias-Vergara, Tomás; Kunduk, Melda; Echternach, Matthias; Döllinger, Michael.

J Acoust Soc Am ; 155(1): 381-395, 2024 01 01.

Article in English | MEDLINE | ID: mdl-38240668

ABSTRACT

Auditory perceptual evaluation is considered the gold standard for assessing voice quality, but its reliability is limited due to inter-rater variability and coarse rating scales. This study investigates a continuous, objective approach to evaluate hoarseness severity combining machine learning (ML) and sustained phonation. For this purpose, 635 acoustic recordings of the sustained vowel /a/ and subjective ratings based on the roughness, breathiness, and hoarseness scale were collected from 595 subjects. A total of 50 temporal, spectral, and cepstral features were extracted from each recording and used to identify suitable ML algorithms. Using variance and correlation analysis followed by backward elimination, a subset of relevant features was selected. Recordings were classified into two levels of hoarseness, H<2 and H≥2, yielding a continuous probability score y∈[0,1]. An accuracy of 0.867 and a correlation of 0.805 between the model's predictions and subjective ratings were obtained using only five acoustic features and logistic regression (LR). Further examination of recordings pre- and post-treatment revealed high qualitative agreement with the change in subjectively determined hoarseness levels. Quantitatively, a moderate correlation of 0.567 was obtained. This quantitative approach to hoarseness severity estimation shows promising results and potential for improving the assessment of voice quality.

Subjects

Dysphonia; Hoarseness; Humans; Hoarseness/diagnosis; Reproducibility of Results; Voice Quality; Phonation; Acoustics; Speech Acoustics; Speech Production Measurement
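
As a rough illustration of the classification step described in the abstract of this record, the sketch below fits a logistic regression by batch gradient descent and returns a probability score in [0, 1] for the H≥2 class. The two-feature toy data, the labels, the learning rate, and all function names are assumptions made for illustration; none of them come from the study, which reports using five selected acoustic features.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function mapping a real score to (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Probability score in [0, 1] for the positive class (here: H >= 2)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical standardized feature vectors; label 1 stands for H >= 2.
X = [[-1.2, -0.8], [-0.9, -1.1], [-0.3, -0.5], [0.4, 0.6], [1.0, 0.9], [1.3, 1.2]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
scores = [predict_proba(w, b, x) for x in X]
```

Thresholding the score at 0.5 recovers the two-class decision, while the raw score can serve as a continuous severity estimate.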

2.

The Effects of a 'New Generation' of Heat and Moisture Exchangers in Laryngectomized Patients with Previous Heat and Moisture Exchanger Experience.

Almajali, Omar; Balk, Matthias; Rupp, Robin; Allner, Moritz; Sievert, Matti; Iro, Heinrich; Schützenberger, Anne; Gostian, Antoniu-Oreste.

Ear Nose Throat J ; : 1455613231200769, 2023 Sep 29.

Article in English | MEDLINE | ID: mdl-37776012

ABSTRACT

Objectives: To evaluate the effects of a new generation of heat and moisture exchangers (NG-HMEs) on pulmonary rehabilitation, quality of life, patient satisfaction, and usage patterns. Methods: A prospective observational study on 23 laryngectomized patients with prior HME experience from June 1, 2021 to November 30, 2021. Patients were interviewed at inclusion and at 6 and 12 weeks after the introduction of NG-HMEs. Two validated questionnaires were used to report pulmonary complaints and quality of life: the Cough and Sputum Assessment Questionnaire (CASA-Q), the European Quality of Life 5 Dimensions Index Score (EQ-5D Index Score), and the European Quality of Life 5 Dimensions Visual Analog Scale (EQ-5D-VAS). Usage patterns and patient satisfaction were reported using study-specific questionnaires. Results: The patients had an average age of 65.7 ± 6.8 years, with 87% being male, on average 33.7 ± 35.3 months after total laryngectomy (TLE). NG-HMEs were used for a mean of 21.87 ± 4.63 hours/day (P = .034). After 12 weeks of use, patients reported the following changes in the CASA-Q domains: cough symptoms (+5; P = .663), cough impact (0; P = .958), sputum symptoms (+8; P = .13), and sputum impact (+3; P = .489). The EQ-5D index score increased (+0.024; P = .917) as well as the EQ-5D VAS (+0.8; P = .27). All patients rated their experience with NG-HMEs as ≥3 out of 5. The patients who used NG-HMEs as instructed (n = 13) reported more profound changes in the CASA-Q domains: cough symptom (+11; P = .129), cough impact (+7; P = .209), sputum symptom (+11; P = .123), and sputum impact (+10; P = .102). Conclusions: Our results show that NG-HMEs could have a positive clinical impact on pulmonary rehabilitation after TLE, even in HME-experienced patients. The use of NG-HMEs does not affect the quality of life. The possible effects of NG-HMEs require further evaluation in long-term studies to fully assess their efficacy.

3.

Extent and Effect of Covering Laryngeal Structures with Synthetic Laryngeal Mucus via Two Different Administration Techniques.

Semmler, Marion; Lasar, Sarina; Kremer, Franziska; Reinwald, Laura; Wittig, Fiori; Peters, Gregor; Schraut, Tobias; Wendler, Olaf; Seyferth, Stefan; Schützenberger, Anne; Dürr, Stephan.

J Voice ; 2023 Aug 28.

Article in English | MEDLINE | ID: mdl-37648625

ABSTRACT

OBJECTIVE: The first goal of this study was to investigate the coverage of laryngeal structures using two potential administration techniques for synthetic mucus: inhalation and lozenge ingestion. As a second research question, the study investigated the potential effects of these techniques on standardized voice assessment parameters. METHODS: Fluorescein was added to throat lozenges and to an inhalation solution to visualize the coverage of laryngeal structures through blue light imaging. The study included 70 vocally healthy subjects. Fifty subjects underwent administration via lozenge ingestion and 20 subjects performed the inhalation process. For the first research question, the recordings from the blue light imaging system were categorized to compare the extent of coverage on individual laryngeal structures objectively. Secondly, a standardized voice evaluation protocol was performed before and after each administration to determine any measurable effects on typical voice parameters. RESULTS: The administration via inhalation demonstrated complete coverage of all laryngeal structures, including the vocal folds, ventricular folds, and arytenoid cartilages, as visualized by the fluorescent dye. In contrast, the application of the lozenge predominantly covered the pharynx and laryngeal surface toward the aryepiglottic fold, but not the inferior structures. All in all, the comparison before and after administration showed no clear effect, although a minor deterioration of the acoustic signal was noted in the shimmer and cepstral peak prominence after the inhalation. CONCLUSIONS: Our findings indicate that the inhalation process is a more effective technique for covering deeper laryngeal structures such as the vocal folds and ventricular folds with synthetic mucus. This knowledge enables further in vivo studies on the role of laryngeal mucus in phonation in general, and how it can be substituted or supplemented for patients with reduced glandular activity as well as for heavy voice users.

4.

Mechanical Parameters Based on High-Speed Videoendoscopy of the Vocal Folds in Patients With Ectodermal Dysplasia.

Pelka, Franziska; Ensthaler, Maria; Wendler, Olaf; Kniesburges, Stefan; Schützenberger, Anne; Semmler, Marion.

J Voice ; 2023 Mar 25.

Article in English | MEDLINE | ID: mdl-36973131

ABSTRACT

OBJECTIVE: Patients suffering from ectodermal dysplasia (ED), which is an inherited disorder in the development of the ectodermal structures, have a significantly reduced expression of teeth, hair, sweat glands, and salivary glands in the respiratory tract including the larynx. Previous studies within the framework of the present project showed a significantly reduced saliva production and an impairment of the acoustic outcome in ED patients compared to the control group. However, until now, no statistically significant difference between EDs and controls could be found regarding vocal fold dynamics in the high-speed videoendoscopy (HSV) recordings using representative parameters on closure, symmetry, and periodicity. The aim of this study is to examine the role of tissue characteristics by means of objective mechanical parameters derived from HSV recordings. METHODS: This study includes 28 ED patients and 42 controls (no ED, healthy voice). The vocal fold oscillations were recorded by high-speed videoendoscopy (HSV@4kHz). Based on the dynamical measures of the glottal area waveform (GAW), objective glottal dynamic parameters associated with tissue properties like flexibility and stiffness were computed. RESULTS: The present evaluation displays a significant difference between male ED patients and male controls concerning the HSV-based mechanical parameters indicating reduced stiffness and increased deformability for the vocal folds of male ED patients. In contrast to strongly amplitude-dependent parameters, the primarily velocity-based parameters showed no statistically significant deviation. CONCLUSIONS: The presented data provides the first promising indication toward the underlying causes on the laryngeal level leading to the voice conspicuities in ED patients. The significant difference concerning the mechanical parameters suggests a different composition of the extracellular matrix of the tissue of the vocal folds of ED patients compared to controls.

5.

Nyquist Plot Parametrization for Quantitative Analysis of Vibration of the Vocal Folds.

Arias-Vergara, Tomás; Döllinger, Michael; Schraut, Tobias; Mohd Khairuddin, Khairy Anuar; Schützenberger, Anne.

J Voice ; 2023 Feb 09.

Article in English | MEDLINE | ID: mdl-36774264

ABSTRACT

OBJECTIVES: The Nyquist plot provides a graphical representation of the glottal cycles as elliptical trajectories in a 2D plane. This study proposes a methodology to parameterize the Nyquist plot with application to support the quantitative analysis of voice disorders. METHODS: We considered high-speed videoendoscopy recordings of 33 functional dysphonia (FD) patients and 33 normophonic controls (NC). Quantitative analysis was performed by computing four shape-based parameters from the Nyquist plot: Variability, Size (Perimeter and Area), and Consistency. Additionally, we performed automatic classification using a linear support vector machine and feature importance analysis by combining the proposed features with state-of-the-art glottal area waveform (GAW) parameters. RESULTS: We found that the inter-cycle variability was significantly higher in FD patients compared to NC. We achieved a classification accuracy of 83% when the top 30 most important features were used. Furthermore, the proposed Nyquist plot features were ranked in the top 12 most important features. CONCLUSIONS: The Nyquist plot provides complementary information for subjective and objective assessment of voice disorders. On the one hand, with visual inspection it is possible to observe intra- and inter-glottal cycle irregularities during sustained phonation. On the other hand, shape-based parameters allow quantifying such irregularities and provide complementary information to state-of-the-art GAW parameters.

6.

GlottisNetV2: Temporal Glottal Midline Detection Using Deep Convolutional Neural Networks.

Kruse, Elina; Döllinger, Michael; Schützenberger, Anne; Kist, Andreas M.

IEEE J Transl Eng Health Med ; 11: 137-144, 2023.

Article in English | MEDLINE | ID: mdl-36816097

ABSTRACT

High-speed videoendoscopy is a major tool for quantitative laryngology. Glottis segmentation and glottal midline detection are crucial for computing vocal fold-specific, quantitative parameters. However, fully automated solutions show limited clinical applicability. Especially unbiased glottal midline detection remains a challenging problem. We developed a multitask deep neural network for glottis segmentation and glottal midline detection. We used techniques from pose estimation to estimate the anterior and posterior points in endoscopy images. Neural networks were set up in TensorFlow/Keras and trained and evaluated with the BAGLS dataset. We found that a dual decoder deep neural network termed GlottisNetV2 outperforms the previously proposed GlottisNet in terms of MAPE on the test dataset (1.85% to 6.3%) while converging faster. Using various hyperparameter tunings, we allow fast and directed training. Using temporal variant data on an additional data set designed for this task, we can improve the median prediction accuracy from 2.1% to 1.76% when using 12 consecutive frames and additional temporal filtering. We found that temporal glottal midline detection using a dual decoder architecture together with keypoint estimation allows accurate midline prediction. We show that our proposed architecture allows stable and reliable glottal midline predictions ready for clinical use and analysis of symmetry measures.

Subjects

Glottis; Vocal Cords; Neural Networks, Computer; Endoscopy

7.

Influence of Reduced Saliva Production on Phonation in Patients With Ectodermal Dysplasia.

Semmler, Marion; Kniesburges, Stefan; Pelka, Franziska; Ensthaler, Maria; Wendler, Olaf; Schützenberger, Anne.

J Voice ; 37(6): 913-923, 2023 Nov.

Article in English | MEDLINE | ID: mdl-34353685

ABSTRACT

OBJECTIVE: Patients with ectodermal dysplasia (ED) suffer from an inherited disorder in the development of the ectodermal structures. Besides the main symptoms, i.e. significantly reduced formation/expression of teeth, hair and sweat glands, decreased saliva production has been objectively documented. In addition to difficulties with chewing/swallowing, ED patients frequently report the subjective impression of rough and hoarse voices. A correlation between the reduced production of saliva and an affliction of the voice has not yet been investigated objectively for this rare disease. METHODS: Following an established measurement protocol, a study has been conducted on 31 patients with ED and 47 controls (no ED, healthy voice). Additionally, the vocal fold oscillations were recorded by high-speed videoendoscopy (HSV@4 kHz). The glottal area waveform was determined by segmentation and objective glottal dynamic parameters were calculated. The generated acoustic signal was evaluated by objective and subjective measures. The individual impairment was documented by a standardized questionnaire (VHI). Additionally, the amount of generated saliva was measured for a defined period of time. RESULTS: ED patients displayed a significantly reduced saliva production compared to the control group. Furthermore, the auditory-perceptual evaluation yielded significantly higher ratings for breathiness and hoarseness in the voices of male ED patients compared to male controls. The majority of male ED patients (67%) indicated at least minor impairment in the self-evaluation. Objective acoustic measures like Jitter and Shimmer confirmed the decreased acoustic quality in male ED patients, whereas none of the investigated HSV parameters showed significant differences between the test groups. Statistical analysis did not confirm a statistically significant correlation between reduced voice quality and amount of saliva. CONCLUSIONS: An objective impairment of the acoustic outcome was demonstrated for male ED patients. However, the vocal fold dynamics in HSV recordings seem unaffected.

Subjects

Ectodermal Dysplasia; Saliva; Humans; Male; Phonation; Vocal Cords; Voice Quality; Hoarseness

8.

Long-term performance assessment of fully automatic biomedical glottis segmentation at the point of care.

Groh, René; Dürr, Stephan; Schützenberger, Anne; Semmler, Marion; Kist, Andreas M.

PLoS One ; 17(9): e0266989, 2022.

Article in English | MEDLINE | ID: mdl-36129922

ABSTRACT

Deep Learning has a large impact on medical image analysis and lately has been adopted for clinical use at the point of care. However, there is only a small number of reports of long-term studies that show the performance of deep neural networks (DNNs) in such an environment. In this study, we measured the long-term performance of a clinically optimized DNN for laryngeal glottis segmentation. We have collected the video footage for two years from an AI-powered laryngeal high-speed videoendoscopy imaging system and found that the footage image quality is stable across time. Next, we determined the DNN segmentation performance on lossy and lossless compressed data revealing that only 9% of recordings contain segmentation artifacts. We found that lossy and lossless compression is on par for glottis segmentation, however, lossless compression provides significantly superior image quality. Lastly, we employed continual learning strategies to continuously incorporate new data into the DNN to remove the aforementioned segmentation artifacts. With modest manual intervention, we were able to largely alleviate these segmentation artifacts by up to 81%. We believe that our suggested deep learning-enhanced laryngeal imaging platform consistently provides clinically sound results, and together with our proposed continual learning scheme will have a long-lasting impact on the future of laryngeal imaging.

Subjects

Larynx; Point-of-Care Systems; Artifacts; Glottis/diagnostic imaging; Image Processing, Computer-Assisted/methods; Larynx/diagnostic imaging; Neural Networks, Computer

9.

A single latent channel is sufficient for biomedical glottis segmentation.

Kist, Andreas M; Breininger, Katharina; Dörrich, Marion; Dürr, Stephan; Schützenberger, Anne; Semmler, Marion.

Sci Rep ; 12(1): 14292, 2022 08 22.

Article in English | MEDLINE | ID: mdl-35995933

ABSTRACT

Glottis segmentation is a crucial step to quantify endoscopic footage in laryngeal high-speed videoendoscopy. Recent advances in deep neural networks for glottis segmentation allow for a fully automatic workflow. However, exact knowledge of integral parts of these deep segmentation networks remains unknown, and understanding the inner workings is crucial for acceptance in clinical practice. Here, we show that a single latent channel as a bottleneck layer is sufficient for glottal area segmentation using systematic ablations. We further demonstrate that the latent space is an abstraction of the glottal area segmentation relying on three spatially defined pixel subtypes allowing for a transparent interpretation. We further provide evidence that the latent space is highly correlated with the glottal area waveform, can be encoded with four bits, and decoded using lean decoders while maintaining a high reconstruction accuracy. Our findings suggest that glottis segmentation is a task that can be highly optimized to gain very efficient and explainable deep neural networks, important for application in the clinic. In the future, we believe that online deep learning-assisted monitoring is a game-changer in laryngeal examinations.

Subjects

Glottis; Larynx; Endoscopy; Glottis/diagnostic imaging; Image Processing, Computer-Assisted; Neural Networks, Computer; Video Recording

10.

Re-Training of Convolutional Neural Networks for Glottis Segmentation in Endoscopic High-Speed Videos.

Döllinger, Michael; Schraut, Tobias; Henrich, Lea A; Chhetri, Dinesh; Echternach, Matthias; Johnson, Aaron M; Kunduk, Melda; Maryn, Youri; Patel, Rita R; Samlan, Robin; Semmler, Marion; Schützenberger, Anne.

Appl Sci (Basel) ; 12(19)2022 Oct.

Article in English | MEDLINE | ID: mdl-37583544

ABSTRACT

Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To consider resulting "concept shifts" for neural network (NN)-based image processing, re-training of already trained and used NNs is necessary to allow for sufficiently accurate image processing for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNN) being used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and different spatial resolutions. Results showed that increasing data diversity by means of preprocessing already improves the segmentation accuracy (mIoU + 6.35%). Subsequent re-training further increases segmentation performance (mIoU + 2.81%). For re-training, finetuning with dynamic knowledge distillation showed the most promising results. Data variety for training and additional re-training is a helpful tool to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data while forgetting already learned knowledge.
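
The mIoU figure quoted in this abstract is a mean intersection over union between predicted and reference segmentation masks. A minimal sketch of that metric follows, assuming binary masks stored as nested lists and averaging per image; the function names, the toy masks, and the convention for two empty masks are illustrative assumptions rather than details taken from the paper.

```python
def iou(pred, target):
    """Intersection over union of two binary masks given as nested lists."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks (closed glottis) count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds, targets):
    """Average the per-image IoU over a set of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Hypothetical 2x3 masks: one partially overlapping pair, one empty pair.
pred_a = [[0, 1, 1], [0, 1, 0]]
true_a = [[0, 1, 0], [0, 1, 0]]
pred_b = [[0, 0, 0], [0, 0, 0]]
true_b = [[0, 0, 0], [0, 0, 0]]

score_a = iou(pred_a, true_a)
score_mean = mean_iou([pred_a, pred_b], [true_a, true_b])
```

How empty masks are scored matters for glottis data, because frames with a fully closed glottis contain no foreground pixels at all.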

11.

OpenHSV: an open platform for laryngeal high-speed videoendoscopy.

Kist, Andreas M; Dürr, Stephan; Schützenberger, Anne; Döllinger, Michael.

Sci Rep ; 11(1): 13760, 2021 07 02.

Article in English | MEDLINE | ID: mdl-34215788

ABSTRACT

High-speed videoendoscopy is an important tool to study laryngeal dynamics, to quantify vocal fold oscillations, to diagnose voice impairments at laryngeal level and to monitor treatment progress. However, there is a significant lack of an open source, expandable research tool that features latest hardware and data analysis. In this work, we propose an open research platform termed OpenHSV that is based on state-of-the-art, commercially available equipment and features a fully automatic data analysis pipeline. A publicly available, user-friendly graphical user interface implemented in Python is used to interface the hardware. Video and audio data are recorded in synchrony and are subsequently fully automatically analyzed. Video segmentation of the glottal area is performed using efficient deep neural networks to derive glottal area waveform and glottal midline. Established quantitative, clinically relevant video and audio parameters were implemented and computed. In a preliminary clinical study, we recorded video and audio data from 28 healthy subjects. Analyzing these data in terms of image quality and derived quantitative parameters, we show the applicability, performance and usefulness of OpenHSV. Therefore, OpenHSV provides a valid, standardized access to high-speed videoendoscopy data acquisition and analysis for voice scientists, highlighting its use as a valuable research tool in understanding voice physiology. We envision that OpenHSV serves as basis for the next generation of clinical HSV systems.

Subjects

Glottis/surgery; Laryngeal Diseases/surgery; Laryngoscopy/methods; Larynx/surgery; Adolescent; Adult; Female; Glottis/diagnostic imaging; Glottis/physiopathology; Humans; Laryngeal Diseases/diagnostic imaging; Laryngeal Diseases/pathology; Laryngoscopy/instrumentation; Larynx/diagnostic imaging; Larynx/pathology; Male; Middle Aged; Neural Networks, Computer; Video Recording; Vocal Cords/diagnostic imaging; Vocal Cords/physiopathology; Vocal Cords/surgery; Voice/physiology; Voice Disorders/diagnostic imaging; Voice Disorders/physiopathology; Voice Disorders/surgery; Voice Quality/physiology; Young Adult

12.

A Deep Learning Enhanced Novel Software Tool for Laryngeal Dynamics Analysis.

Kist, Andreas M; Gómez, Pablo; Dubrovskiy, Denis; Schlegel, Patrick; Kunduk, Melda; Echternach, Matthias; Patel, Rita; Semmler, Marion; Bohr, Christopher; Dürr, Stephan; Schützenberger, Anne; Döllinger, Michael.

J Speech Lang Hear Res ; 64(6): 1889-1903, 2021 06 04.

Article in English | MEDLINE | ID: mdl-34000199

ABSTRACT

Purpose: High-speed videoendoscopy (HSV) is an emerging, but barely used, endoscopy technique in the clinic to assess and diagnose voice disorders because of the lack of dedicated software to analyze the data. HSV allows quantification of the vocal fold oscillations by segmenting the glottal area. This challenging task has been tackled by various studies; however, the proposed approaches are mostly limited and not suitable for daily clinical routine. Method: We developed user-friendly software in C# that allows the editing, motion correction, segmentation, and quantitative analysis of HSV data. We further provide pretrained deep neural networks for fully automatic glottis segmentation. Results: We freely provide our software Glottis Analysis Tools (GAT). Using GAT, we provide a general threshold-based region growing platform that enables the user to analyze data from various sources, such as in vivo recordings, ex vivo recordings, and high-speed footage of artificial vocal folds. Additionally, especially for in vivo recordings, we provide three robust neural networks at various speed and quality settings to allow a fully automatic glottis segmentation needed for application by untrained personnel. GAT further evaluates video and audio data in parallel and is able to extract various features from the video data, among others the glottal area waveform, that is, the changing glottal area over time. In total, GAT provides 79 unique quantitative analysis parameters for video- and audio-based signals. Many of these parameters have already been shown to reflect voice disorders, highlighting the clinical importance and usefulness of the GAT software. Conclusion: GAT is a unique tool to process HSV and audio data to determine quantitative, clinically relevant parameters for research, diagnosis, and treatment of laryngeal disorders. Supplemental Material: https://doi.org/10.23641/asha.14575533.

Subjects

Deep Learning; Larynx; Glottis; Humans; Laryngoscopy; Phonation; Software; Vibration; Video Recording; Vocal Cords
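
The abstract of this record describes the glottal area waveform as the changing glottal area over time. A minimal sketch of how such a waveform could be derived from per-frame segmentation masks is given below; it is written in Python for illustration and is not GAT's C# implementation. The function names, the pixel-count definition of area, and the toy frames are assumptions.

```python
def glottal_area_waveform(masks):
    """Return the number of glottis pixels in each frame of a mask sequence."""
    return [sum(1 for row in frame for px in row if px) for frame in masks]


def normalize(gaw):
    """Min-max scale a waveform to [0, 1]; a constant waveform maps to zeros."""
    lo, hi = min(gaw), max(gaw)
    if hi == lo:
        return [0.0] * len(gaw)
    return [(a - lo) / (hi - lo) for a in gaw]


# Hypothetical 3x3 masks covering one opening and closing cycle.
frames = [
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # closed
    [[0, 1, 0], [0, 1, 0], [0, 0, 0]],  # opening
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],  # maximally open
    [[0, 1, 0], [0, 1, 0], [0, 0, 0]],  # closing
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # closed
]

gaw = glottal_area_waveform(frames)
gaw_norm = normalize(gaw)
```

Pixel counts depend on camera distance and resolution, which is one reason a normalized waveform is often easier to compare across recordings.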

13.

Fluid-structure-acoustic interactions in an ex vivo porcine phonation model.

Semmler, Marion; Berry, David A; Schützenberger, Anne; Döllinger, Michael.

J Acoust Soc Am ; 149(3): 1657, 2021 03.

Article in English | MEDLINE | ID: mdl-33765793

ABSTRACT

In the clinic, many diagnostic and therapeutic procedures focus on the oscillation patterns of the vocal folds (VF). Dynamic characteristics of the VFs, such as symmetry, periodicity, and full glottal closure, are considered essential features for healthy phonation. However, the relevance of these individual factors in the complex interaction between the airflow, laryngeal structures, and the resulting acoustics has not yet been quantified. Sustained phonation was induced in nine excised porcine larynges without vocal tract (supraglottal structures had been removed above the ventricular folds). The multimodal setup was designed to simultaneously control and monitor key aspects of phonation in the three essential parts of the larynx. More specifically, measurements comprised (1) the subglottal pressure signal, (2) high-speed recordings in the glottal plane, and (3) the acoustic signal in the supraglottal region. The automated setup regulates glottal airflow, asymmetric arytenoid adduction, and the pre-phonatory glottal gap. Statistical analysis revealed a beneficial influence of VF periodicity and glottal closure on the signal quality of the subglottal pressure and the supraglottal acoustics, whereas VF symmetry only had a negligible influence. Strong correlations were found between the subglottal and supraglottal signal quality, with significant improvement of the acoustic quality for high levels of periodicity and glottal closure.

Subjects

Larynx; Phonation; Acoustics; Animals; Glottis; Pressure; Swine; Vocal Cords

14.

Interdependencies between acoustic and high-speed videoendoscopy parameters.

Schlegel, Patrick; Kist, Andreas M; Kunduk, Melda; Dürr, Stephan; Döllinger, Michael; Schützenberger, Anne.

PLoS One ; 16(2): e0246136, 2021.

Article in English | MEDLINE | ID: mdl-33529244

ABSTRACT

In voice research, uncovering relations between the oscillating vocal folds, being the sound source of phonation, and the resulting perceived acoustic signal is of great interest. This is especially the case in the context of voice disorders, such as functional dysphonia (FD). We investigated 250 high-speed videoendoscopy (HSV) recordings with simultaneously recorded acoustic signals (124 healthy females, 60 FD females, 44 healthy males, 22 FD males). 35 glottal area waveform (GAW) parameters and 14 acoustic parameters were calculated for each recording. Linear and non-linear relations between GAW and acoustic parameters were investigated using Pearson correlation coefficients (PCC) and distance correlation coefficients (DCC). Further, norm values for parameters obtained from 250 ms long sustained phonation data (vowel /i/) were provided. 26 PCCs in females (5.3%) and 8 in males (1.6%) were found to be statistically significant (|corr.| ≥ 0.3). Only minor differences were found between PCCs and DCCs, indicating presence of weak non-linear dependencies between parameters. Fundamental frequency was involved in the majority of all relevant PCCs between GAW and acoustic parameters (19 in females and 7 in males). The most distinct difference between correlations in females and males was found for the parameter Period Variability Index. The study shows only weak relations between investigated acoustic and GAW-parameters. This indicates that the reduction of the complex 3D glottal dynamics to the 1D-GAW may erase laryngeal dynamic characteristics that are reflected within the acoustic signal. Hence, other GAW parameters, 2D-, 3D-laryngeal dynamics and vocal tract parameters should be further investigated towards potential correlations to the acoustic signal.

Subjects

Dysphonia/physiopathology; Glottis/physiopathology; Laryngoscopy/methods; Acoustics; Adult; Aged; Case-Control Studies; Female; Humans; Laryngoscopy/instrumentation; Male; Middle Aged; Video Recording; Voice Quality; Young Adult
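
The pairwise screening described in the abstract of this record can be sketched as follows: compute a Pearson correlation coefficient for every pairing of a GAW parameter with an acoustic parameter and keep the pairs with |r| ≥ 0.3, the cut-off quoted in the abstract. The parameter names, the toy values, and the function names are hypothetical, and the sketch covers neither the significance testing nor the distance correlation used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def relevant_pairs(gaw_params, acoustic_params, threshold=0.3):
    """Return (gaw_name, acoustic_name, r) for every pair with |r| >= threshold."""
    hits = []
    for g_name, g_vals in gaw_params.items():
        for a_name, a_vals in acoustic_params.items():
            r = pearson(g_vals, a_vals)
            if abs(r) >= threshold:
                hits.append((g_name, a_name, r))
    return hits


# Hypothetical per-recording values for five recordings.
gaw_params = {
    "gaw_f0": [100, 120, 140, 160, 180],
    "gaw_other": [1, -1, 0, -1, 1],
}
acoustic_params = {"acoustic_f0": [101, 119, 142, 158, 181]}

hits = relevant_pairs(gaw_params, acoustic_params)
```

In this toy data only the two fundamental-frequency estimates are strongly related, loosely mirroring the abstract's observation that fundamental frequency was involved in most of the relevant correlations.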

15.

Rethinking glottal midline detection.

Kist, Andreas M; Zilker, Julian; Gómez, Pablo; Schützenberger, Anne; Döllinger, Michael.

Sci Rep ; 10(1): 20723, 2020 11 26.

Article in English | MEDLINE | ID: mdl-33244031

ABSTRACT

A healthy voice is crucial for verbal communication and hence in daily as well as professional life. The basis for a healthy voice is the sound-producing vocal folds in the larynx. A hallmark of healthy vocal fold oscillation is the symmetric motion of the left and right vocal fold. Clinically, videoendoscopy is applied to assess the symmetry of the oscillation and evaluated subjectively. High-speed videoendoscopy, an emerging method that allows quantification of the vocal fold oscillation, is more commonly employed in research due to the amount of data and the complex, semi-automatic analysis. In this study, we provide a comprehensive evaluation of methods that detect the glottal midline fully automatically. We used a biophysical model to simulate different vocal fold oscillations, extended the openly available BAGLS dataset using manual annotations, utilized both simulations and annotated endoscopic images to train deep neural networks at different stages of the analysis workflow, and compared these to established computer vision algorithms. We found that classical computer vision algorithms perform well on detecting the glottal midline in glottis segmentation data, but are outperformed by deep neural networks on this task. We further suggest GlottisNet, a multi-task neural architecture featuring the simultaneous prediction of both the opening between the vocal folds and the symmetry axis, leading to a huge step forward towards clinical applicability of quantitative, deep learning-assisted laryngeal endoscopy, by fully automating segmentation and midline detection.

16.

7q31.2q31.31 deletion downstream of FOXP2 segregating in a family with speech and language disorder.

Rieger, Melissa; Krumbiegel, Mandy; Reuter, Miriam S; Schützenberger, Anne; Reis, André; Zweier, Christiane.

Am J Med Genet A ; 182(11): 2737-2741, 2020 11.

Article in English | MEDLINE | ID: mdl-32885567

ABSTRACT

Chromosomal 7q31 deletions have been described in individuals with variable neurodevelopmental phenotypes including speech and language impairment. These copy number variants usually encompass FOXP2, haploinsufficiency of which represents a widely acknowledged cause for specific speech and language disorders. By chromosomal microarray analysis we identified a 4.7 Mb microdeletion at 7q31.2q31.31 downstream of FOXP2 in three family members presenting with variable speech, language and neurodevelopmental phenotypes. The index individual showed delayed speech development with impaired speech production, reduced language comprehension, and additionally learning difficulties, microcephaly, and attention deficit. His younger sister had delayed speech development with impaired speech production and partially reduced language comprehension. Their mother had attended a school for children with speech and language deficiencies and presented with impaired articulation. The deletion had occurred de novo in the mother, includes 15 protein-coding genes and is located in close proximity to the 3' end of FOXP2. Though a novel locus at 7q31.2q31.31 associated with mild neurodevelopmental and more prominent speech and language impairment is possible, the close phenotypic overlap with FOXP2-associated speech and language disorder rather suggests a positional effect on FOXP2 expression and function.

Subjects

Chromosome Deletion , Chromosomes, Human, Pair 7/genetics , Forkhead Transcription Factors/genetics , Language Disorders/pathology , Phenotype , Speech Disorders/pathology , Child , Child, Preschool , Female , Humans , Language Disorders/genetics , Male , Pedigree , Speech Disorders/genetics

17.

Machine learning based identification of relevant parameters for functional voice disorders derived from endoscopic high-speed recordings.

Schlegel, Patrick; Kniesburges, Stefan; Dürr, Stephan; Schützenberger, Anne; Döllinger, Michael.

Sci Rep ; 10(1): 10517, 2020 06 29.

Article in English | MEDLINE | ID: mdl-32601277

ABSTRACT

In voice research and clinical assessment, many objective parameters are in use. However, there is no commonly used set of parameters that reflects certain voice disorders, such as functional dysphonia (FD), i.e., disorders with no visible anatomical changes. Hence, 358 high-speed videoendoscopy (HSV) recordings (159 normal females (NF), 101 FD females (FDF), 66 normal males (NM), 32 FD males (FDM)) were analyzed. We investigated the significance of 91 quantitative HSV parameters. First, 25 highly correlated parameters were discarded. Second, a further 54 parameters were discarded using a LogitBoost decision stumps approach. This yielded a subset of 12 parameters sufficient to reflect functional dysphonia. These parameters separated the groups NF vs. FDF and NM vs. FDM with fair accuracies of 0.745 and 0.768, respectively. Parameters computed solely from the changing glottal area waveform between the vocal folds (a 1D function called GAW) were less important than parameters describing the oscillation characteristics along the vocal folds (a 2D function called Phonovibrogram). The most important parameters were the regularity of GAW phases and peak shape, harmonic structure, and Phonovibrogram-based vocal fold opening and closing angles. This study shows the high degree of redundancy among HSV voice parameters but also affirms the need for multidimensional assessment of clinical data.

Subjects

Endoscopy , Machine Learning , Voice Disorders/diagnosis , Voice Quality/physiology , Age Factors , Female , Humans , Laryngoscopy , Male , Voice Disorders/physiopathology
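
The first reduction step in the abstract above, discarding highly correlated parameters, can be sketched as a greedy filter on pairwise Pearson correlation. The threshold of 0.9, the greedy keep-first order, and the names `pearson` and `drop_correlated` are assumptions for illustration; the abstract does not state the criterion that was used.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # constant column: treated as uncorrelated
    return cov / math.sqrt(va * vb)

def drop_correlated(columns, threshold=0.9):
    """Keep a parameter only if |r| <= threshold against every parameter kept so far.

    columns maps parameter name -> list of values (one per recording).
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Because the filter is greedy, the result depends on the order of the columns; a real analysis would need a rule for which member of a correlated pair to retain.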

18.

BAGLS, a multihospital Benchmark for Automatic Glottis Segmentation.

Gómez, Pablo; Kist, Andreas M; Schlegel, Patrick; Berry, David A; Chhetri, Dinesh K; Dürr, Stephan; Echternach, Matthias; Johnson, Aaron M; Kniesburges, Stefan; Kunduk, Melda; Maryn, Youri; Schützenberger, Anne; Verguts, Monique; Döllinger, Michael.

Sci Data ; 7(1): 186, 2020 06 19.

Article in English | MEDLINE | ID: mdl-32561845

ABSTRACT

Laryngeal videoendoscopy is one of the main tools in clinical examinations for voice disorders and in voice research. High-speed videoendoscopy makes it possible to fully capture the vocal fold oscillations; however, processing the recordings typically involves a time-consuming segmentation of the glottal area by trained experts. Even though automatic methods have been proposed and the task is particularly suited to deep learning, there are no public datasets or benchmarks available to compare methods and to allow training of deep learning models that generalize. In an international collaboration of researchers from seven institutions in the EU and USA, we have created BAGLS, a large, multihospital dataset of 59,250 high-speed videoendoscopy frames with individually annotated segmentation masks. The frames are based on 640 recordings of healthy and disordered subjects that were recorded with varying technical equipment by numerous clinicians. The BAGLS dataset will allow an objective comparison of glottis segmentation methods and will enable interested researchers to train their own models and compare their methods.

Subjects

Endoscopy , Glottis/physiology , Video Recording , Vocal Cords/physiology , Voice Disorders/diagnosis , Glottis/diagnostic imaging , Humans , Vocal Cords/diagnostic imaging
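
An objective comparison of segmentation methods needs a score that compares a predicted mask with its annotated counterpart. The abstract above does not name a metric; intersection over union (IoU) is a common choice and is assumed in this sketch, as is the function name `iou`.

```python
def iou(pred, truth):
    """Intersection over union of two binary masks given as equally sized lists of rows."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")
    inter = union = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            p, t = bool(p), bool(t)
            inter += p and t
            union += p or t
    # Two empty masks (for example, a fully closed glottis) count as a perfect match.
    return 1.0 if union == 0 else inter / union
```

Averaging this score over all annotated frames gives one number per method. How empty masks are scored is a design decision that should be reported alongside the result.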

19.

Analysis of the tonal sound generation during phonation with and without glottis closure.

Kniesburges, Stefan; Lodermeyer, Alexander; Semmler, Marion; Schulz, Yvonne Katrin; Schützenberger, Anne; Becker, Stefan.

J Acoust Soc Am ; 147(5): 3285, 2020 05.

Article in English | MEDLINE | ID: mdl-32486803

ABSTRACT

Human phonation is characterized by periodic oscillations of the vocal folds with complete glottis closure. In contrast, glottal insufficiency (GI) is an oscillation without glottis closure, resulting in a breathy and weak voice. In this study, flow-induced oscillations of silicone vocal folds were modeled with and without glottis closure. The measurements comprised the flow pressure in the model, the generated sound, and high-speed footage of the vocal fold motion. The analysis revealed that the sound signal for vocal fold oscillations without closure exhibits fewer harmonic tones, with smaller amplitudes, than the case with complete closure. The time series of the pressure signals showed small, periodic oscillations that occurred less frequently and with smaller amplitude in the GI case. Accordingly, the pressure spectra, like the sound spectra, include fewer harmonics. The analysis of the high-speed videos indicates that the strength of the pressure oscillations correlates with the divergence angle of the glottal duct during the closing motion. Physiologically, large divergence angles typically occur with a pronounced mucosal wave motion and glottis closure. Thus, the results indicate a correlation between the intensity of the mucosal wave and the development of harmonic tones.

Subjects

Glottis , Phonation , Humans , Motion , Sound , Vocal Cords

20.

Determination of Clinical Parameters Sensitive to Functional Voice Disorders Applying Boosted Decision Stumps.

Schlegel, Patrick; Kist, Andreas M; Semmler, Marion; Döllinger, Michael; Kunduk, Melda; Dürr, Stephan; Schützenberger, Anne.

IEEE J Transl Eng Health Med ; 8: 2100511, 2020.

Article in English | MEDLINE | ID: mdl-32518739

ABSTRACT

BACKGROUND: Various voice assessment tools, such as questionnaires and aerodynamic voice characteristics, can be used to assess the vocal function of individuals. However, not much is known about which combinations of these parameters best identify functional dysphonia in clinical settings. METHODS: This study investigated six scores from questionnaires commonly used in clinical practice and seven acoustic parameters. A total of 514 females and 277 males were analyzed. The subjects were divided into three groups: one healthy group (N01) (49 females, 50 males) and two disordered groups with perceptually hoarse (FD23) (220 females, 96 males) and perceptually not hoarse (FD01) (245 females, 131 males) sounding voices. A tree stumps Adaboost approach was applied to find the subset of parameters that best separates the groups. Subsequently, pairwise pre- and post-treatment comparisons of parameters were used to determine whether this parameter subset reflects treatment outcome for 120 female and 51 male patients. RESULTS: The questionnaire "Voice-related-quality-of-Life" and three objective parameters ("maximum fundamental frequency", "maximum Intensity" and "Jitter Percent") were sufficient to separate the groups (accuracy ranging from 0.690 (FD01 vs. FD23, females) to 0.961 (N01 vs. FD23, females)). Our study suggests that a reduced parameter subset (4 out of 13) is sufficient to separate these three groups. All parameters reflected treatment outcome for patients with hoarse voices, and Voice-related-quality-of-Life showed improvement for the not hoarse group (FD01). CONCLUSION: Results show that single parameters are insufficient to separate voice disorders but that a set of several well-chosen parameters is sufficient. These findings will help to optimize and reduce clinical assessment time.
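
To make the "tree stumps Adaboost" idea concrete, the following is a minimal sketch of discrete AdaBoost with one-feature threshold stumps, written from the textbook algorithm. It is not the authors' implementation: the number of rounds, the exhaustive threshold search, and the names `fit_stump`, `adaboost`, and `predict` are all assumptions. The features that the stumps end up using form a candidate parameter subset.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[j] > thr else -pol for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds=10):
    """Train on labels in {-1, +1}; returns a list of (alpha, feature, threshold, polarity)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Increase the weight of misclassified samples, then renormalize.
        pred = [pol if row[j] > thr else -pol for row in X]
        w = [wi * math.exp(-alpha * t * p) for wi, t, p in zip(w, y, pred)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if row[j] > thr else -pol) for a, j, thr, pol in model)
    return 1 if score >= 0 else -1
```

After training, `{j for _, j, _, _ in model}` lists the feature indices the ensemble relied on. Features that never appear are candidates for removal, which is one way a reduced subset such as 4 out of 13 could be obtained.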
